    Genetic Analysis of Completely Sequenced Disease-Associated MHC Haplotypes Identifies Shuffling of Segments in Recent Human History

    The major histocompatibility complex (MHC) is recognised as one of the most important genetic regions in relation to common human disease. Advancement in identification of MHC genes that confer susceptibility to disease requires greater knowledge of sequence variation across the complex. Highly duplicated and polymorphic regions of the human genome such as the MHC are, however, somewhat refractory to some whole-genome analysis methods. To address this issue, we are employing a bacterial artificial chromosome (BAC) cloning strategy to sequence entire MHC haplotypes from consanguineous cell lines as part of the MHC Haplotype Project. Here we present 4.25 Mb of the human haplotype QBL (HLA-A26-B18-Cw5-DR3-DQ2) and compare it with the MHC reference haplotype and with a second haplotype, COX (HLA-A1-B8-Cw7-DR3-DQ2), that shares the same HLA-DRB1, -DQA1, and -DQB1 alleles. We have defined the complete gene, splice variant, and sequence variation contents of all three haplotypes, comprising over 259 annotated loci and over 20,000 single nucleotide polymorphisms (SNPs). Certain coding sequences vary significantly between different haplotypes, making them candidates for functional and disease-association studies. Analysis of the two DR3 haplotypes allowed delineation of the shared sequence between two HLA class II–related haplotypes differing in disease associations and the identification of at least one of the sites that mediated the original recombination event. The levels of variation across the MHC were similar to those seen for other HLA-disparate haplotypes, except for a 158-kb segment that contained the HLA-DRB1, -DQA1, and -DQB1 genes and showed very limited polymorphism compatible with identity-by-descent and relatively recent common ancestry (<3,400 generations). These results indicate that the differential disease associations of these two DR3 haplotypes are due to sequence variation outside this central 158-kb segment, and that shuffling of ancestral blocks via recombination is a potential mechanism whereby certain DR–DQ allelic combinations, which presumably have favoured immunological functions, can spread across haplotypes and populations.

    Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

    The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Difference in R01 Grant Funding Among Osteopathic and Allopathic Emergency Physicians over the Last Decade

    Introduction: Receiving an R01 grant from the National Institutes of Health (NIH) is regarded as a major accomplishment for the physician researcher and can be used as a means of scholarly activity for core faculty in emergency medicine (EM). However, the Accreditation Council for Graduate Medical Education requires that a grant must be obtained for it to count towards a core faculty member’s scholarly activity, while the American Osteopathic Association states that an application for a grant would qualify for scholarly activity whether it is received or not. The aim of the study was to determine if a medical degree disparity exists between those who successfully receive an EM R01 grant and those who do not, and to determine the publication characteristics of those recipients. Methods: We queried the NIH RePORTER search engine for those physicians who received an R01 grant in EM. Degree designation was then determined for each grant recipient based on a web-based search involving the recipient’s name and the location where the grant was awarded. The grant recipient was then queried through PubMed Central for the total number of publications published in the decade prior to receiving the grant. Results: We noted a total of 264 R01 grant recipients during the study period; of those who received the award, 78.03% were allopathic physicians. No osteopathic physician had received an R01 grant in EM over the past 10 years. Of those allopathic physicians who received the grant, 44.17% held a dual degree. Allopathic physicians had an average of 48.05 publications over the 10 years prior to grant receipt and those with a dual degree had 51.62 publications. Conclusion: Allopathic physicians comprise the majority of those who have received an R01 grant in EM over the last decade. These physicians typically have numerous prior publications and an advanced degree.

    Meeting Highlights: Genome Informatics

    We bring you the highlights of the second Joint Cold Spring Harbor Laboratory and Wellcome Trust ‘Genome Informatics’ Conference, organized by Ewan Birney, Suzanna Lewis and Lincoln Stein. There were sessions on in silico data discovery, comparative genomics, annotation pipelines, functional genomics and integrative biology. The conference included a keynote address by Sydney Brenner, who was awarded the 2002 Nobel Prize in Physiology or Medicine (jointly with John Sulston and H. Robert Horvitz) a month later.

    Identification of Mammalian microRNA Host Genes and Transcription Units

    To derive a global perspective on the transcription of microRNAs (miRNAs) in mammals, we annotated the genomic position and context of this class of noncoding RNAs (ncRNAs) in the human and mouse genomes. Of the 232 known mammalian miRNAs, we found that 161 overlap with 123 defined transcription units (TUs). We identified miRNAs within introns of 90 protein-coding genes with a broad spectrum of molecular functions, and in both introns and exons of 66 mRNA-like noncoding RNAs (mlncRNAs). In addition, novel families of miRNAs based on host gene identity were identified. The transcription patterns of all miRNA host genes were curated from a variety of sources illustrating spatial, temporal, and physiological regulation of miRNA expression. These findings strongly suggest that miRNAs are transcribed in parallel with their host transcripts, and that the two different transcription classes of miRNAs (‘exonic’ and ‘intronic’) identified here may require slightly different mechanisms of biogenesis.